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^  Abstract 

Model-based  recognition  is  concerned  with  comparing  a  shape  A ,  which 
is  stored  as  a  model  for  some  particular  object,  with  a  shape  B,  which  is 
found  to  exist  in  an  image.  If  A  and  B  are  close  to  being  the  same  shape, 
then  a  vision  system  should  report  a  match  and  return  a  measure  of  how 
good  that  match  is.  To  be  useful  this  measure  should  satisfy  a  number  of 
properties,  including:  (1)  it  should  be  a  metric,  (2)  it  should  be  invariant 
under  translation,  rotation,  and  change-of-scale,  (3)  it  should  be  reasonably 
easy  to  compute,  and  (4)  it  should  match  our  intuition  (i.e.,  answers  should 
be  similar  to  those  that  a  person  might  give).  We  develop  a  method  for 
comparing  polygons  that  has  these  properties.  The  method  works  for  both 
convex  and  nonconvex  polygons  and  runs  in  time  0(mn  log  mn)  where  m  is 
the  number  of  vertices  in  one  polygon  and  n  is  the  number  of  vertices  in  the 
other.  Wr  nlno  pirrirntj  Tnmr  examples  that ^t he  method  produces 

answers  that  are  intuitively  reasonable.  sk,***^  _ 
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1  Introduction 


A  problem  of  both  theoretical  and  practical  importance  in  computer  vision  is  that 
of  comparing  two  shapes.  To  what  extent  is  shape  A  similar  to  shape  B ?  Model- 
based  recognition  is  concerned  with  comparing  a  shape  A,  which  is  stored  as  a 
model  for  some  particular  object,  with  a  shape  B,  which  is  found  to  exist  in  an 
image.  If  .4  and  B  are  close  to  being  of  the  same  shape,  then  a  vision  system 
should  report  a  match  and  return  a  measure  of  how  good  that  match  is.  Hence, 
we  are  interested  in  defining  and  computing  a  cost  function  d(A,B)  associated 
with  two  shapes  A  and  B  that  measures  their  similarity. 

The  long-term  goal  of  this  research  is  to  develop  methods  of  comparing  ar¬ 
bitrary  shapes  in  two  or  three  dimensions.  Here,  we  restrict  our  attention  to 
polygonal  shapes  in  the  plane,  with  an  extension  to  the  case  in  which  a  boundary 
may  contain  circular  arcs  in  addition  to  straight  line  segments.  Our  technique  is 
designed  to  work  with  objects  for  which  the  entire  boundaries  are  known. 

Before  suggesting  a  similarity  measure  to  be  used  for  comparing  polygons,  we 
examine  several  properties  that  such  a  measure  d(-,  •)  should  have. 

•  It  should  be  a  metric. 

-  d(A,  B)  >  0  for  all  A  and  B. 

-  d(A,  B)  =  0  if  and  only  if  A  =  B.  We  expect  a  shape  to  resemble  itself. 

-  d(  A,  B)  —  d(B,  A)  for  all  A  and  B  (Symmetry).  The  order  of  compari¬ 
son  should  not  matter. 

-  d(A,  B)  +  d(B,  C)  >  d(A,  C)  for  all  A,  B,  and  C  (Triangle  Inequality). 

The  triangle  inequality  is  necessary  since  without  it  we  can  have  a  case  in 
which  d(A,B)  and  d(B,C )  are  both  very  small,  but  d(A,C)  is  very  large. 
This  is  undesirable  for  pattern  matching  and  visual  recognition  applications. 
If  A  is  very  similar  to  B  and  B  is  very  similar  to  C,  then  A  and  C  should 
not  be  too  dissimilar. 

•  It  should  be  invariant  under  translation,  rotation,  and  change-of-scale.  In 
other  words,  we  want  to  measure  shape  alone. 
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•  It  should  be  reaonably  easy  to  compute.  This  must  hold  for  the  measure  to 
be  of  practical  use. 

•  Most  important  of  all,  it  should  match  our  intuitive  notions  of  shape  resem¬ 
blance.  In  other  words,  answers  should  be  similar  to  those  that  a  human 
might  give.  In  particular,  the  measure  should  be  insensitive  to  small  per¬ 
turbations  (or  small  errors)  in  the  data.  For  example,  moving  a  vertex  by 
a  small  amount  or  breaking  a  single  edge  into  two  edges  should  not  have  a 
large  effect. 


Representation  of  Polygons 

A  standard  method  of  representing  a  simple  polygon  A  is  to  describe  its  bound¬ 
ary  by  giving  a  (circular)  list  of  vertices,  expressing  each  vertex  as  a  coordinate 
pair.  An  alternative  representation  of  the  boundary  of  a  simple  polygon  A  is  to 
give  the  turning  function  ©4(3).  The  function  @4(3)  measures  the  angle  of  the 
counterclockwise  tangent  as  a  function  of  the  arc-length  s,  measured  from  some 
reference  point  0  on  A’s  boundary.  Thus  ©4(0)  is  the  angle  v  that  the  tangent  at 
the  reference  point  0  makes  with  some  reference  orientation  associated  with  the 
polygon  (such  as  the  r-axis).  ©4(3)  keeps  track  of  the  turning  that  takes  place, 
increasing  with  left-hand  turns  and  decreasing  with  right-hand  turns  (see  Figure 
1).  Formally,  if  k(s)  is  the  curvature  function  for  a  curve,  then  k(s)  =  @'(s).  The 
curvature  function  /c(s)  is  frequently  used  as  a  shape  signature  ([HT],  [HW],  [SS] , 
[Wol],  [Wo2]). 

Other  authors  have  used  a  slightly  different  definition  of  the  turning  function 
in  which  @4(0)  is  defined  to  be  0.  Our  definition,  in  which  @4(0)  is  the  angle  of 
the  tangent  line  at  the  reference  point,  leads  to  a  simple  correspondence  between 
a  shift  of  ©4(s)  in  the  9  direction  and  a  rotation  of  A.  This  correspondence  is  less 
clear  for  the  alternate  definition. 

Without  loss  of  generality,  we  assume  that  each  polygon  is  rescaled  so  that  the 
total  perimeter  length  is  1;  hence,  ©4  is  a  function  from  [0, 1]  to  &.  For  a  convex 
polygon  A ,  ©4(3)  is  a  monotone  function,  starting  at  some  value  v  and  increasing 
to  v  -f  2x.  For  a  nonconvex  polygon,  ©4(3)  may  become  arbitrarily  large,  since 
it  accumulates  the  total  amount  of  turn,  which  can  grow  as  a  polygon  “spirals” 
inward.  Although  ©4(3)  may  become  very  large  over  the  interval  3  6  [0,1],  in 
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Figure  1:  Defining  the  turn  function  0(s). 

order  for  the  function  to  represent  a  simple  closed  curve,  we  must  have  0^(1)  = 
0.4(0)  +  27 r  (assuming  that  the  origin  0  is  placed  at  a  differentiable  point  along 
the  curve). 

The  domain  of  @4(3)  can  be  extended  to  the  entire  real  line  in  a  natural  way 
by  allowing  angles  to  continue  to  accumulate  as  we  continue  around  the  perimeter 
of  the  polygon  A.  Thus,  for  a  simple  closed  polygon,  the  value  of  ©4(3  4-  1)  is 
0.4(3)  +  27T  for  all  s.  Note  that  the  function  ©4(3)  is  well-defined  even  for  arbitrary 
(not  necessarily  simple  or  closed  or  polygonal)  paths  A  in  the  plane.  When  the 
path  is  polygonal,  the  turning  function  is  piecewise-constant,  with  jump  points 
corresponding  to  the  vertices  of  A. 

Representation  of  planar  curves  (and,  in  particular,  polygons)  in  terms  of  some 
function  of  arc  length  has  been  used  by  a  number  of  other  researchers  (e.g.,  [OW], 
[SS],  etc.)  in  computational  geometry  and  computer  vision.  We  use  this  represen¬ 
tation  to  compute  a  distance  function  for  comparing  two  simple  polygons  ( A  and 
B)  by  looking  at  natural  notions  of  distances  between  the  turning  functions  0^(s) 
and  ©0(3). 

The  function  ©4(3)  has  several  properties  which  make  it  especially  suitable  for 
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Figure  2:  Non-uniform  noise  is  problematic  for  this  distance  function. 

our  purposes.  It  is  piecewise-constant  for  polygons  (and  polygonal  paths),  mak¬ 
ing  computations  particularly  easy  and  fast.  By  definition,  the  function  &a(s)  is 
invariant  under  translation  and  scaling  of  the  polygon  A.  Rotation  of  A  corre¬ 
sponds  to  a  simple  shift  of  ©^(s)  in  the  0  direction.  Note  also  that  changing  the 
location  of  the  origin  O  by  an  amount  t  €  [0, 1]  along  the  perimeter  of  polygon  A 
corresponds  to  a  horizontal  shift  of  the  function  ©^(s)  and  is  simple  to  compute 
(the  new  turning  function  is  given  by  ©^(s  +  t)). 

We  formally  define  the  distance  function  between  two  polygons  A  and  B  as  the 
Lp  distance  between  their  two  turning  functions  ©^(s)  and  ©b(s),  minimized  with 
respect  to  vertical  and  horizontal  shifts  of  the  turning  functions  (in  other  words, 
we  minimize  with  respect  to  rotation  and  choice  of  reference  points).  For  p  =  2  we 
show  that  this  distance  function  can  be  computed  efficiently  in  0(n 2  log  n)  time 
for  polygons  with  n  vertices. 

One  possible  drawback  of  our  distance  function  is  that  it  may  be  unstable  under 
certain  kinds  of  noise,  in  particular,  non-uniform  noise.  For  example,  suppose  we 
have  a  triangle  with  one  very  wiggly  side  so  that  the  wiggly  side  accounts  for  most 
of  the  length  of  the  boundary  of  the  polygon.  Comparing  it  with  a  triangle,  we 
will  get  a  very  bad  (in  fact,  arbitrarily  bad)  match.  See  Figure  2.  Fortunately, 
in  many  computer  vision  applications  it  is  reasonable  to  assume  that  the  noise 
is  roughly  uniformly  distributed  over  the  sides  of  the  polygon,  in  which  case  the 
similarity  measure  we  define  performs  nicely.  See  Section  4  for  examples. 

Schwartz  and  Sharir  [SS]  have  defined  a  notion  of  distance  similar  to  ours. 
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However,  they  compute  an  approximation  based  on  discretizing  the  turning  func¬ 
tions  of  the  two  shapes  into  many  equally  spaced  points;  thus,  the  quality  of  the 
approximation  depends  on  the  number  of  points  chosen.  Our  approach,  on  the 
other  hand,  is  to  examine  the  combinatorial  complexity  of  computing  the  exact 
metric  function  between  two  polygon  boundaries,  using  only  the  original  vertices. 
Thus,  our  method  runs  in  time  0(n 2  log  n)  where  n  is  the  total  number  of  polygon 
vertices,  while  their  method  computes  an  approximate  distance  in  time  0(Hog  fc), 
where  k  >>  n  is  the  total  number  of  interpolation  points  used.  Furthermore,  [SS] 
suggest  “convexifying”  non-convex  polygons  in  order  to  compare  them,  as  their 
method  does  not  apply  to  non-convex  polygons.  Our  algorithm  applies  to  both 
convex  and  non-convex  simple  polygons  (and  even  to  some  types  of  non-simple 
polygons). 

The  remainder  of  this  paper  is  organized  as  follows.  In  Section  2  we  give  a 
formal  definition  of  the  distance  between  two  polygons  based  on  their  turning 
functions.  We  prove  that  this  function  is  a  metric  and  show  some  of  its  properties. 
These  results  are  used  in  Section  3  to  develop  an  0(n3)  algorithm  for  computing 
the  distance  between  two  polygons,  where  n  is  the  total  number  of  vertices;  we 
then  refine  this  algorithm  to  obtain  an  0(n 2  log  n)  running  time.  Section  4  contains 
examples  of  the  distance  function  computed  for  several  polygons  using  an  imple¬ 
mentation  of  our  method.  Section  5  is  a  summary  and  discussion  of  extensions 
and  further  research. 


2  A  Polygon  Distance  Function 

Consider  two  polygons  A  and  B  and  their  associated  turning  functions  &a(s)  and 
0B(s).  The  degree  to  which  A  and  B  are  similar  can  be  measured  by  taking  the 
distance  between  the  functions  0,t(s)  and  0g(s)  according  to  our  favorite  metric 
on  function  spaces  (e.g.,  Lp  metrics)  .  Define  the  Lp  distance  between  A  and  B  as 

6r(A,  B)  =  lie,  -  ©all,  =  (jf 1  le^s)  -  e«wr  <fa) ' , 
where  ||  •  ||p  denotes  the  Lp  norm. 

6P  has  some  undesirable  properties:  it  is  sensitive  to  both  rotation  of  polygon 
A  (or  B)  and  choice  of  reference  point  on  the  boundary  of  A  (or  B).  Since  rotation 
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and  choice  of  reference  point  are  arbitrary,  it  makes  more  sense  to  consider  the 
distance  to  be  the  minimum  over  all  such  choices.  If  we  shift  the  reference  point 
0  along  ,4's  boundary  by  an  amount  f,  then  the  new  turning  function  is  given  by 
@4(s  + t).  If  we  rotate  A  by  angle  8  then  the  new  function  is  given  by  @4(5)  4-  8. 
Thus,  we  want  to  find  the  minimum  over  all  such  shifts  t  and  rotations  8  (i.e.,  over 
all  horizontal  and  vertical  shifts  of  ©a(s)).  In  other  words,  we  want  to  solve  for 


dp( A,  B) 


(  min  /  |0^(s  +  <) 

<e[o,i]y0 

(  min  D£B(t,8)Y, 

\9€9  tg[0,l]  p  ) 


1^ 

p 


where 

Dp'B(t,  8)  =  fQ  |0^(3  4-  t)  -  &B(s)  4-  8\p  ds 


Lemma  1  dp(A,  B)  is  a  metric  for  all  p  >  0. 

Proof.  This  follows  from  the  fact  that  ||  •  ||p  is  a  metric  for  all  p  >  0.  ■ 

Lemma  2  For  any  fixed  value  of  t,  and  for  any  p  >  1,  D£  B(t,8 )  is  a  convex 
function  of  8. 

Proof.  For  fixed  s  and  fixed  t,  the  function  F(8)  =  (©^(s  4- 1)  —  ©s(s)  4-  8\p 
is  clearly  a  convex  function  by  the  convexity  of  G{y)  =  |j/|p  (for  p  >  1),  and 
integrating  a  convex  function  maintains  convexity.  I 

In  particular,  using  the  Li  metric,  Di  'B{t,B)  is  a  quadratic  function  of  8  for 
any  fixed  value  of  t.  This  holds  for  any  simple  closed  shapes,  but  it  is  especially 
easy  to  see  for  polygons. 

We  assume  from  now  on  that  A  and  B  are  polygons.  Then,  for  a  fixed  t,  the 
integral  jjj  |©/i(s  4- 1)  -  ©b(s)  4-  8\ 2  ds  can  be  computed  by  adding  up  the  value 
of  the  integral  within  each  strip  defined  by  a  consecutive  pair  of  discontinuities  in 
Qa(s)  and  ©b(s)  (see  Figure  3).  The  integral  within  a  strip  is  trivially  computed 
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Figure  3:  The  rectangular  strips  formed  by  the  functions  Qa{$)  and  Qg(s). 

as  the  width  of  the  strip  times  the  square  of  the  difference  |@a(s  +  t)  —  @fl(s)| 
(which  is  constant  within  each  strip).  Note  that  if  m  and  n  are  the  numbers  of 
vertices  in  A  and  B,  respectively,  then  there  are  m  +  n  strips  and  that  as  9  changes, 
the  value  of  the  integral  for  each  strip  is  a  quadratic  function  of  9. 

In  order  to  compute  d?(A,  B),  we  must  minimize  D*  B(t,  9)  over  all  t  and  9.  We 
begin  by  finding  the  optimal  9  for  any  fixed  value  of  t.  To  simplify  notation  in  the 
following  discussion,  we  use  f(s)  =  @*(s),  g(s)  =  0a(s),  and  h(t,9)  =  D?-B(t,9). 

Lemma  3  Let  h(t,9)  =  /o(/(s  -ft)  —  g(s)  +  9 )2  ds.  Then,  in  order  to  minimize 
h(t,9),  the  best  value  of  9  is  given  by 

r(t)  =  f\  g(s)-f(s  +  t))ds 

JO 

=  a  —  2  irt, 


where  a  =  fj  g(s)  ds  -  fj  f(s)  ds. 


Proof. 

+  2/(a  +  t)  —  2g(s))  ds 

=  +  2  +  ()-»(>))*■ 
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Lemma  2  assures  us  that  the  minimum  occurs  when  we  set  this  quantity  equal  to 
zero  and  solve  for  9.  Thus, 

0*(O  =  /  (g(s)  -  f(s  +  t))  ds. 

Jo 

Now. 


r  1  rt+l 

Jo  f(s  +  o  ds  =  jf  f(s)  ds 

r  1  rt+ 1 

=  I  f(s)  ds  +  [f(s  -  1)  +  2tt] 
Jt  J  i 

/■t  rt 

=  /  /(s)  ds  +  \  f(s)  ds  +  2nt 

Jt  Jo 

=  2?rt  +  /  f(s)  ds. 

Jo 


Thus, 


0m(t)  =  I  g(s)  ds  -2 jt<  -  /  /(s) 

JO  JO 


=  a  -  27ft. 


■ 

Substituting  the  expression  for  0*(t)  in  di(A,  B)  we  are  left  with  a  one- variable 
minimization  problem: 


d*(A,B) 


min  h(t,0m(t)) 
‘610.11 


[/(a  +  t)  -  ff(s)]2  ds  - 


3  Algorithmic  Details 

In  this  section  we  show  that  the  distance  function  achieves  its  minimum  at  one  of 
mn  discrete  points  on  [0, 1],  which  we  call  critical  events.  Recall  that  in  the  process 
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of  finding  d2(A,B)  we  have  to  shift  the  function  f(s)  to  f(s  +  t)  for  t  £  [0,1], 
During  this  shifting  operation  the  breakpoints  of  /  collide  with  the  breakpoints  of 
g.  We  define  a  critical  event  as  a  value  of  t  where  a  breakpoint  of  /  collides  with 
a  breakpoint  of  g.  Clearly  there  are  mn  such  critical  events  for  m  breakpoints  in 
/  and  n  breakpoints  in  g. 

Using  the  fact  that  the  minimum  is  obtained  at  a  critical  event,  we  present  a 
basic  algorithm  for  computing  d2(A,B)  that  runs  in  0(n3)  time  for  two  n-vertex 
polygons  (and  time  0(mn(m  +  n))  for  an  m  vertex  polygon  and  an  n  vertex 
polygon).  We  then  describe  how  to  modify  the  basic  method  to  improve  the 
runtime  to  0(n2  log  n)  (or  0(mn  log  mn)  for  unequal  numbers  of  vertices). 

Recall  that  d2(A,  B)  =  min t.e(h(t,8))? ,  where  h(t,9)  =  D2'B(t,9).  We  prove 
that  h(t,9)  has  properties  that  lead  to  efficient  algorithms  for  computing  our 
polygon  metric. 

Lemma  4  ///(•)  and  g(-)  are  two  piecewise-constant  functions  with  m  and  n  break¬ 
points  respectively,  then  for  constant  9, 

h(t,  9)  =  f  (f(s  +  t)-g(s)  +  9)2ds 
Jo 

is  piecewise  linear  as  a  function  oft,  with  mn  breakpoints. 

Proof.  We  give  a  geometric  proof.  First  recall  that  for  a  given  value  of  t  the 
discontinuities  in  /  and  g  define  a  set  of  m  +  n  rectangular  strips  (see  Figure  3). 
The  value  of  h(t ,  9)  is  simply  the  sum  over  all  these  strips  of  the  width  of  a  strip 
times  the  square  of  its  height.  Except  at  critical  events,  as  /  is  shifted  the  width 
of  each  strip  changes,  but  the  height  remains  constant.  Each  changing  rectangle 
contributes  to  changes  in  h(t,9).  If  t  is  the  amount  of  shift,  then  for  a  shrinking 
rectangle,  the  change  is  (— t)  times  the  square  of  the  height;  for  a  growing  rectangle 
the  change  is  (+t)  times  the  square  of  the  height.  Since  the  heights  are  constant, 
the  change  in  h(t,  9)  is  a  sum  of  linear  terms  and  is  therefore  linear.  Breakpoints 
in  h(t,9)  clearly  occur  at  each  of  the  mn  critical  events  where  a  discontinuity  of  / 
is  aligned  with  a  discontinuity  of  g.  I 

This  result  leads  to  a  straightforward  algorithm  for  computing  d2(A,  B).  Let 
( t" ,  9m)  be  the  location  of  the  minimum  value  of  h(t,  9).  By  the  preceeding  lemma, 
h(t,9m)  is  linear  as  a  function  of  t\  thus,  t *  must  be  at  one  of  the  mn  breakpoints 


of  h.  In  other  words,  we  can  find  the  minimum  by  checking  just  the  values  at 
breakpoints.  These  observations  lead  to  the  following  result. 

Corollary  5  The  distance  d2(A,B)  between  two  polygons  A  and  B  (with  m  and  n 
vertices)  can  be  computed  exactly  in  time  0(mn(m  +  n)). 

Proof.  We  prove  the  theorem  by  describing  the  algorithm.  Given  values  for  both  t 
and  9,  h(t,9)  can  be  computed  in  0(m  +  n)  time  by  adding  the  contributions  of  the 
m  +  n  rectangular  strips  between  /  and  g.  Let  Co,  Cj, . . . ,  cmn  be  the  critical  events 
that  occur  as  /  is  shifted  by  t.  By  the  preceeding  observations,  the  minimum 
occurs  when  t  equals  one  of  cq,  cj,  . . . ,  cmn.  Since  the  best  9  value  for  a  given  t  can 
be  found  in  constant  time  (Lemma  3),  we  simply  compute  h(t,9m(t))  in  0(m  +  n) 
time  for  each  of  these  critical  events,  find  the  minimum,  and  take  its  square  root 
to  get  d2(  A,  B).  I 


Refinement  of  the  Algorithm 

Theorem  6  The  distance  d2(A,B)  between  two  polygons  A  and  B  (with  m  and  n 
vertices)  can  be  computed  exactly  in  time  0(mn  log  mn). 


Proof.  We  prove  the  theorem  by  describing  the  algorithm.  The  basic  idea  is 
the  same  as  the  previous  algorithm:  we  compute  h(t,9*(t))  for  each  of  the  critical 
values  of  t.  First,  note  that  we  only  need  to  compute  the  critical  values  of  h(t,  0), 
since  by  Lemma  3  the  effect  due  to  9m(t)  can  be  calculated  separately.  Second,  we 
observe  that  h(t,  0)  changes  in  a  very  constrained  fashion.  As  a  matter  of  fact,  by 
keeping  track  of  a  small  set  of  values  we  cam  easily  determine  how  the  function 
A(t,0)  changes  at  each  critical  event. 

The  values  we  keep  track  of  are  based  on  the  rectangular  strips  that  appear 
between  the  two  functions  /(s)  and  g(s).  (To  simplify  notation,  we  use  /  and  g 
in  place  of  0^  and  0g,  respectively.)  Recall  that  g(s)  is  fixed  in  place  and  that 
f(s)  is  shifted  by  t.  For  a  given  value  of  t,  the  discontinuities  in  f(s  +  t)  and  g(s) 
define  a  set  of  rectangular  strips,  as  was  illustrated  in  Figure  3.  Each  rectangular 
strip  has  /  at  the  top  and  g  at  the  bottom  or  vice-versa.  The  sides  of  a  strip  are 
determined  by  discontinuities  in  /  and  g. 
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For  the  purposes  of  the  algorithm,  we  separate  the  strips  into  four  groups  based 
on  the  discontinuities  at  the  sides  of  the  strips:  Rjj  for  those  with  /  on  both  sides; 
Rgg  for  those  with  g  on  both  sides;  Rjg  for  those  with  /  on  the  left  and  g  on  the 
right;  and  Rgj  for  those  with  g  on  the  left  and  /  on  the  right.  The  sets  R/g  and 
Rgf  are  particularly  important,  as  these  are  the  strips  whose  widths  change  as  t 
changes  (as  /  is  shifted).  Thus,  these  strips  affect  the  slope  of  h(t,  0). 

We  keep  track  of  two  quantities:  H/g  and  Hg /.  Hfg  is  the  sum  of  the  squares 
of  the  heights  of  all  the  strips  in  Rfg,  and  Hgj  is  the  sum  of  the  squares  of  the 
heights  of  all  the  strips  in  Rgf. 

The  algorithm  is  based  on  the  observation  that  for  values  of  t  between  two 
critical  events  the  slope  of  h(t,  0)  is  Hgf  —  H/g.  This  follows  from  the  fact  that,  as 
/  is  shifted  by  f,  Rfg  is  the  set  of  all  strips  that  increase  in  width  by  t,  and  Rgf 
is  the  set  of  all  strips  that  decrease  in  width  by  t.  The  widths  of  the  Rff  and  Rgg 
strips  remain  unchanged. 

Consider  what  happens  at  one  of  the  critical  events,  where  the  change  is  no 
longer  simply  linear.  We  claim  that  the  quantities  Hjg  and  Hgf  can  be  easily 
updated  at  these  points.  To  see  this  note  that,  at  a  critical  event,  an  fg- type  strip 
disappears  (its  width  goes  to  zero)  and  a  new  sr/-type  strip  appears  (see  Figure  3). 
At  the  same  time,  the  right  boundary  of  the  adjacent  strip  to  the  left  is  converted 
from  /  to  g,  and  the  left  boundary  of  the  adjacent  strip  to  the  right  is  converted 
from  g  to  /.  To  update  Hjg  and  Hgj  we  need  to  know  just  the  values  of  /  and  g 
around  the  critical  event. 

This  gives  us  the  following  algorithm: 


1.  Initialize: 

•  Given  the  piecewise  constant  functions  /  and  g,  determine  the  critical 
events:  the  shifts  of  /  by  t  such  that  a  discontinuity  in  /  coincides  with 
a  discontinuity  in  g.  Sort  these  critical  events  by  how  far  /  must  be 
shifted  for  each  event  to  occur.  Let  co,ci, . . .  ,Ce  be  the  ordered  list  of 
shifts  for  the  critical  events;  Co  =  0. 

•  Calculate  h( 0,0).  This  involves  summing  the  contributions  of  each  of 
m  4-  n  strips  and  takes  linear  time. 

•  Determine  initial  values  for  Hjg  and  Hgj. 
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2.  For  i  =  1  to  e 

•  Determine  the  value  of  /i(c,, 0)  =  ( Hgf  —  Hgf)(c,  -  c,_i)  +  h(c,_i,0). 

•  Update  Hfg  and  Hgj. 


The  algorithm  takes  advantage  of  the  fact  that  h(t,  0)  is  piecewise  linear  as  a 
function  of  t\  thus,  the  entire  function  can  be  determined  once  we  know  an  initial 
value  and  the  slope  for  each  piece.  It  is  easy  to  see  that  the  time  for  initialization 
is  dominated  by  the  time  it  takes  to  sort  the  critical  events:  0(e  loge),  where  e 
is  the  number  of  critical  events,  or  0(mn  log  mn)  where  m  and  n  are  the  sizes  of 
the  two  polygons.  The  updates  required  for  the  remainder  of  the  algorithm  take 
a  total  of  0(e),  or  0(m  +  n)  time.  1 

In  practice,  it  might  be  useful  to  recalculate  h(t ,  0)  periodically  from  scratch 
to  avoid  errors  that  could  accumulate.  If  this  is  done  every  0(^^)  steps  then  the 
time  bound  for  the  entire  algorithm  remains  O(eloge). 


4  Examples 

In  this  section  we  illustrate  some  of  the  qualitative  aspects  of  the  distance  function 
d2(A,  B)  by  comparing  some  simple  polygons  using  the  algorithm  described  in  the 
previous  section.  In  addition  to  providing  a  distance,  d2(A,  B),  between  two  poly¬ 
gonal  shapes,  the  method  gives  the  relative  orientation,  9 *,  and  the  corresponding 
reference  points  of  the  two  polygons  for  which  this  distance  is  attained. 

The  first  example  compares  two  simple  polygons  that  are  very  similar  in  shape, 
but  which  are  at  different  orientations  (see  Figure  4).  The  value  of  d2(A,  B)  is  0.16 
which  is  attained  at  a  rotation  of  180  degrees  and  with  the  upper  left  vertex  of  the 
first  polygon  matched  with  the  lower  right  vertex  of  the  second  one.  (Distances 
less  than  about  1.0  seem  to  correspond  to  polygons  that  a  person  would  rate  as 
resembling  each  other;  pairs  of  polygons  that  are  very  different  can  have  arbitrarily 
high  distances.) 

To  illustrate  how  the  distance  function  can  be  used  to  compare  a  model  with 
several  different  instances,  we  consider  the  case  of  matching  a  triangle  to  three 
different  shapes:  another  triangle,  a  “cut-off”  triangle,  and  a  quadrilateral,  as 
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Figure  4:  Comparing  two  simple  polygons. 


Figure  5:  Comparing  several  polygons. 
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Figure  6:  A  rectangle  with  a  notch  removed. 

shown  in  Figure  5.  The  distances  for  these  three  matches  are  .09,  .14  and  .23, 
respectively.  The  three  corresponding  rotations  are  175,  176  and  184  degrees. 
Thus  we  see  that  the  triangle  is  the  best  match,  the  cut-off  triangle  matches 
second  best,  and  the  trapezoid  is  the  worst  match.  The  match  to  the  cut-off 
triangle  suggests  that  the  metric  is  useful  for  matching  partially  occluded  objects, 
as  long  as  the  overall  shape  of  the  object  does  not  change  too  radically. 

Our  metric  also  provides  a  qualitatively  good  estimate  of  a  match  when  one 
polygon  is  an  instance  of  another,  but  with  some  perturbation  of  its  boundary.  A 
simple  example  is  given  by  the  cut-off  triangle  in  the  previous  figure,  where  the 
orientation  estimate  is  still  about  180  degrees  even  though  the  number  of  vertices 
has  changed.  Another  example  is  given  in  Figure  6,  where  we  compare  a  model 
rectangle  against  a  second  rectangle  and  against  another  rectangle  with  a  notch 
removed.  The  distances  are  .07  and  .11,  respectively,  with  relative  orientations  of 
2  and  1  degrees. 

An  extreme  case  of  matching  distorted  polygons  is  shown  in  Figure  7,  where  a 
triangle  is  compared  with  a  somewhat  triangular  shape.  In  this  case  the  distance  is 
.22,  and  the  orientation  difference  is  2  degrees.  Note  however,  that,  as  mentioned  in 
the  introduction  (see  Figure  2),  such  perturbations  must  occur  relatively  uniformly 
along  the  perimeter  of  the  polygon  for  the  match  to  be  reasonable. 
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Figure  7:  Matching  a  triangle  to  a  highly  noisy  shape. 


5  Summary  and  Discussion 

We  have  suggested  using  the  L 2  metric  on  the  turning  functions  of  polygons  as 
a  way  to  implement  the  intuitive  notion  of  shape-resemblance.  This  method  for 
comparing  shapes  has  the  following  advantages: 

•  It  is  a  metric  on  polygonal  shapes. 

•  It  compares  shape  alone;  it  is  invariant  under  translation,  rotation,  and 
change-of-scale. 

•  It  is  reasonably  easy  to  compute,  taking  time  0(mn  log  mn)  to  compare  an 
m  vertex  polygon  against  am  n  vertex  polygon. 

•  Finally,  it  corresponds  well  to  intuitive  notions  of  shape  resemblance. 

In  addition,  this  metric  works  for  nonconvex  as  well  as  convex  polygons,  and  even 
works  for  some  polygonal  shapes  that  are  not  simple  and/or  not  closed. 

Like  the  method  developed  by  Schwartz  and  Sharir  [SS],  our  method  is  actually 
based  on  a  convolution.  Recall  that  the  major  portion  of  our  algorithm  is  devoted 
to  minimising  h(t,  9)  =  Jo  (/(s  +  <)  —  g(s)  +  9)2  ds.  When  this  formula  is  multiplied 
out,  all  the  terms  depend  on  /  alone  or  g  alone,  except  for  the  convolution  term 
Jo  f(3  +  ds.  If  /  and  g  are  piecewise  constant  with  m  and  n  discontinu¬ 

ities,  respectively,  then  each  term  can  be  calculated  in  either  0(m )  or  0(n)  time 


Figure  8:  A  match  on  which  the  L\  metric  does  poorly  but  the  Li  does  well. 

except  for  the  convolution  term,  which  seems  to  require  0( mn  log  mn)  time.  Of 
course,  the  Fast  Fourier  Transform  (FFT)  can  be  used  to  compute  a  convolution 
in  0(fc  log  k)  time,  but  this  requires  k  evenly  spaced  sample  points  for  each  of 
/  and  g.  For  our  problem,  the  discontinuities  are  not  necessarily  evenly  spaced, 
so  the  FFT  cannot  be  used  unless  we  are  willing  to  approximate  our  functions  / 
and  g.  A  good  approximation  may  require  more  than  mn  points.  In  any  case, 
the  development  of  a  fast  method  for  convolutions  using  unevenly  spaced  sample 
points  would  lead  to  improvements  in  the  time  bound  for  our  technique. 

We  used  the  Li  metric,  but  similar  techniques  can  be  used  to  develop  polygon- 
resemblance  metrics  that  are  based  on  different  function-space  metrics.  Unfor¬ 
tunately,  not  all  such  metrics  have  Li's  advantages  of  being  reasonbly  easy  to 
compute  and  matching  our  ituitive  idea  of  shape  resemblance.  For  instance,  it  is 
also  possible  to  compute  the  L\  metric  on  two  0(s)  functions  using  an  algorithm 
similar  to  that  in  Section  3.  In  the  case  of  the  L\  metric,  however,  the  value  of  6‘ 
is  not  given  directly  for  each  value  of  s  as  it  is  for  the  Li  metric.  Thus  for  each  of 
the  mn  critical  events,  the  optimal  value  of  9  must  be  computed  explicitly.  Using 
a  data  structure  similar  to  that  in  Section  3,  the  overall  computation  can  be  done 
in  time  0(n3logn),  as  opposed  to  0(n2logn)  for  the  Li  metric. 

The  L\  metric  has  an  additional  drawback:  The  optimal  match  will  occur 
when  one  side  of  polygon  A  is  at  the  same  orientation  as  some  side  of  polygon  B. 
That  is,  the  value  of  9*  is  such  that  in  some  strip  the  two  functions  f(s  + 1)  and 
g(s)  are  coincident.  In  contrast,  the  Li  metric  finds  the  optimal  orientation  (in 
a  least  squares  sense)  without  requiring  any  two  edges  to  be  identically  oriented. 
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Examine  Figure  8  to  see  why  requiring  identical  orientations  can  be  undesirable; 
for  the  Li  metric  the  best  match  occurs  at  an  orientation  difference  of  76  degrees, 
bringing  two  edges  into  alignment.  This  would  rotate  the  two  figures  so  that  they 
approximately  form  a  star,  a  bad  match.  In  contrast,  for  the  X2  metric  the  best 
match  is  at  an  orientation  difference  of  7  degrees,  which  agrees  quite  well  with  our 
intuitive  sense  of  the  best  match. 

It  may  be  possible  to  apply  our  methods  to  problems  involving  partially  oc¬ 
cluded  objects,  that  is  objects  for  which  the  entire  model  is  known,  but  for  which 
only  a  portion  of  the  boundary  appears  in  the  image.  Our  technique  as  presented 
here  has  not  been  designed  to  work  with  such  objects,  although,  as  shown  by  some 
of  our  examples,  it  seems  to  give  intuitively  correct  answers  when  objects  are  not 
severely  occluded.  The  combination  of  occluded  objects  and  our  desire  to  make 
our  metric  independent  of  change-of-scale  causes  some  difficulty.  We  were  able  to 
control  change-of-scale  problems  by  normalizing  our  polygons  to  make  all  perime¬ 
ters  have  length  one.  If  portions  of  a  boundary  are  unknown  then  it  is  unclear  how 
this  normalization  should  be  done.  Of  course,  if  the  scale  of  the  image  is  known, 
then  partially  occluded  objects  do  not  present  any  difficulties. 

Our  results  can  be  generalized  to  include  cases  in  which  the  shapes  A  and  B 
have  some  or  all  of  their  boundary  represented  as  circular  arcs.  If  A  includes  some 
circular  arcs  on  its  boundary,  then  the  turning  function  ©^(s)  is  piecewise  linear 
instead  of  piecewise  constant.  As  before,  to  compare  shapes  A  and  B  we  need  to 
minimize 

h(t,8)  =  C  (©4(3  4  t)  -  efl(s)  4-  8\2  da. 

Jo 

The  derivation  of  8m(t),  the  best  value  of  8  for  given  f,  does  not  change  at  all;  it  is 
still  a  quadratic  function  of  t.  But  if  the  shapes  include  circular  arcs,  then  h(t,  0) 
can  be  piecewise  cubic  instead  of  simply  piecewise  linear.  Thus,  the  minimum  value 
of  h(t,  8)  does  not  necessarily  occur  at  a  critical  event,  and  more  information  is 
needed  in  order  to  determine  the  behavior  of  h  between  critical  events. 

However,  since  h(t,  0'(t))  is  piecewise  cubic  as  a  function  of  t,  we  can  determine 
the  behavior  of  h  between  critical  events  if  we  have  enough  data  points  for  h.  If 
we  collect  two  additional  (t,  h(t,0m(t)))  pairs  between  each  adjacent  pair  of  critical 
events  then  we  can  determine  the  coefficients  of  h(t,  0*(t))  (=  a^3  4-  atf2  4-  a^t  +  a4 
between  critical  events)  and  compute  the  minimum  analytically.  Thus,  the  basic 
approach  of  calculating  h(t,0)  in  0(m  4-  n)  time  for  each  of  0(mn)  values  is  still 
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valid  and  there  is  an  algorithm  that  runs  in  time  0(mn(m  +  n)). 

The  0(mn  log  mn)  time  algorithm  can  also  be  generalized  to  work  with  circular 
arcs,  and  the  resulting  algorithm  has  the  same  asymptotic  time  bound.  Unfortu¬ 
nately,  the  number  of  updates  needed  and  the  resulting  accumulation  of  round-off 
error  may  make  this  algorithm  impractical. 
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